Deep Edge-Based Fault Detection for Solar Panels

Solar panels may suffer from faults, which can cause high temperatures and significantly degrade power generation. To detect faulty solar panels in large photovoltaic plants, drones equipped with infrared cameras have been deployed. Drones capture a huge number of infrared images, and it is not realistic to analyze them manually. To solve this problem, we develop a Deep Edge-Based Fault Detection (DEBFD) method, which applies convolutional neural networks (CNNs) for edge detection and object detection on the captured infrared images. In particular, a machine learning-based contour filter is designed to eliminate incorrect background contours. Then faults of solar panels are detected. Based on these fault detection results, solar panels can be classified into two classes, i.e., normal and faulty ones (the latter are also called macro ones). We collected 2060 images in multiple scenes and achieved a high macro F1 score. Our method achieved a frame rate of 28 fps over infrared images of solar panels on an NVIDIA GeForce RTX 2080 Ti GPU.


Introduction
According to the latest report [1], renewable energy sources such as solar and wind systems will meet 88% of the global energy demand by 2050. As an alternative and a critical complement to thermal power, solar energy plays a crucial role in supplying more and more pollution-free electric power. At the same time, with the rapid development of the photovoltaic industry, the daily maintenance and operation of photovoltaic power stations are facing significant challenges, particularly the faults of solar panels. Studies reveal that the annual solar power loss due to various faults, such as snail trails, discoloration, and corrosion [2], is 18.9% [3]. Patrol inspection is a crucial part of the safe operation of photovoltaic power stations and helps ensure that all electrical systems and wires meet legal safety standards.
As solar panels are generally laid on hillsides, plains, swamps, lakes, rooftops, and other inaccessible areas, one patrol task may take several patrollers several weeks, which makes it difficult to meet the demand in terms of efficiency and frequency. With the wide application of Unmanned Aerial Vehicles (UAVs) in surveillance [4], wildlife protection [5], and infrastructural inspection [6], more and more photovoltaic power stations are using drones for inspection. UAVs can capture video streams in just a few hours and are not bound by geographical limitations. Owing to their low cost, high flexibility, and simple operation, UAVs are very popular in equipment inspection. Infrared (IR) imaging, also known as thermography, allows fast and straightforward identification of overheated solar panels. The high signal-to-noise ratio of thermography highlights faults better than ordinary visible images.
Among the major faults of solar panels, dotted hotspots are usually formed due to shading or breakage, while circuits cause rectangular hotspots. Although UAVs can free patrollers from complicated tasks, the analysis of infrared images has always troubled them. Inspectors have to mark the fault locations in a huge number of infrared images, which is time-consuming and depends on the staff's experience. The purpose of this paper is to develop an automatic fault detection algorithm for infrared images of solar panels to reduce the inspection cost of the photovoltaic industry.
Automatic fault detection of photovoltaic panels can be decomposed into two subtasks, localization and classification, i.e., where the faults of a solar panel are and which category they belong to. It is a challenging task and has attracted more and more attention. Three types of images are used: IRT (infrared thermography) [7][8][9][10][11], EL (electroluminescence) [12][13][14][15][16], and PL (photoluminescence) [17,18]. Traditionally, methods based on hand-crafted features [9,11] have been used to determine the location of faults. After obtaining the positions of faults, adopting SVMs (support vector machines) to classify faults is a popular choice among traditional methods [11]. Recent years have witnessed significant progress in fault detection using deep CNNs (convolutional neural networks). Owing to the ability of deep learning to express multi-level features, CNN-based methods surpass traditional methods in tasks such as image classification, object detection, and semantic segmentation. With the rapid development of deep learning in different tasks, many methods have shed light on photovoltaic fault detection. Early methods [8,13,15,17] only solved classification tasks, using CNNs such as VGG19 to learn deep abstract data representations. Later object detection methods [7,14,16], e.g., Faster RCNN [19] and the YOLO series [20][21][22][23][24], can output the location and category of faults in an end-to-end manner. A few semantic segmentation methods [12] demonstrated their feasibility with a small number of faulty samples.
Nevertheless, the above techniques are prevented from having practical applications for the following reasons. First, EL images need to be acquired by professional instruments, which is only feasible in laboratories. Second, traditional descriptor-based approaches require complicated parameter settings and rely entirely on experts' experience. Third, the diversity of drone flight attitudes and altitudes may cause fault detection algorithms to fail in some cases. Finally, class imbalance is the most common problem in photovoltaic (PV) fault detection (the fault rate is normally lower than 4%), which leads to biased results. At the same time, automatic fault detection of solar panels has become a vital issue for efficient daily inspections.
To address the above challenges, we turn to edge detection, a fundamental computer vision problem. Recent literature has shown that CNNs greatly surpass traditional methods in edge detection tasks on public datasets, and modern CNNs can learn rich, hierarchical edge representations. Why not use CNN-based edge detection to locate solar panels? On the one hand, an excellent edge detector can be trained on the BSDS500 [25] dataset, which inspires us to collect datasets at a small cost. On the other hand, infrared images are only affected by temperature, so the edges of solar panels are clearly visible under sufficient sunlight. Edge features therefore remain well suited to locating solar panels in UAV aerial imagery, which is characterized by scale and rotation ambiguity. Based on the above analysis, we construct a general framework for automatic fault detection of solar panels. As shown in Figure 1, the framework consists of three parts: edge detection, contour filter, and classification. The goal is to find the location of faulty solar panels in infrared images. Edge detection finds the positions of all solar panels in the image by finding the contours in the edge mask image. Then the contour filter algorithm deletes the background contours to eliminate the clutter in the background. Lastly, solar panels are classified into normal, dotted faulty, and rectangular faulty. Both dotted faulty and rectangular faulty solar panels count as faulty panels, which should be handled. The main contributions of this paper are listed below.

1. We develop a novel fault detection pipeline for solar panels, which consists of three steps: (a) edge detection, (b) contour filter, and (c) classification.

2. We adapt existing CNNs for fault detection of solar panels. To the best of our knowledge, ours is the first CNN-based edge detection approach in this task. The proposed bottom-up self-attention structure leads to more detailed edge location information.

3. We collect and annotate 1200 images in different photovoltaic power plants (desert, mountain, roof, water, woodland) and achieve a high macro F1 score on 860 testing images.

Related Work

Edge Detection
As one of the fundamental image processing tasks, edge detection has a wealth of research literature. Early traditional methods used operators such as Sobel, Prewitt [26], and Canny [27] to capture the gradient changes in local intensity and brightness in the image. The weakness of traditional methods is that local noise is fatal to the gradient information, so they can hardly be applied in practice. Later, machine learning-based methods became the mainstream of edge detection. Researchers construct feature descriptors by combining intensity, color, and gradient and then use complex learning paradigms to obtain edge intensity, such as in [28][29][30]. Learning-based methods usually yield good results in simple scenarios, but may suffer from a lack of effective edge representation. With the rise of CNNs, deep learning-based edge detectors have been proposed recently. Xie et al. [31] proposed a fast and accurate edge detector, Holistically Nested Edge Detection (HED), which concatenates side outputs of different scales and uses 1 × 1 convolution to fuse them. Wang et al. [32] then used subpixel convolution, instead of bilinear interpolation or deconvolution, which is beneficial for accurate edge localization. Xu et al. [33] first introduced Attention-Gated Conditional Random Fields (AG-CRFs) and considered attention variables as gates, resulting in rich and complementary feature expressions. Later, Liu et al. [34] exploited all the convolutional layers of each stage of the primary network to extract rich features. Deng et al. [35] proposed a new loss function to generate fine edges, which alleviated the problem of class imbalance. He et al. [36] developed a Bi-Directional Cascaded Network that applies different supervision to multi-scale features and surpasses human perception on the BSDS500 dataset. Su et al. [37] designed a novel pixel difference convolution inspired by traditional edge detection operators, which can achieve more than 100 FPS (frames per second) on an NVIDIA RTX 2080 Ti GPU.

Fault Detection of PV Panels
The purpose of fault detection of solar panels is to apply image processing technology to reduce the burden of operation and maintenance. Existing methods can be roughly divided into three categories: (1) statistical features, (2) machine vision, and (3) deep learning. Dotenco et al. [9] statistically modeled temperature, using statistics such as the Gaussian distribution, median, and histogram, to locate solar panels and identify overheated areas. Machine vision methods focus on the texture features of the image after filtering. For example, [10,11] both improve the Canny operator to obtain a diverse expression of the edge. Deep learning methods achieve promising performance and find many large-scale applications. Li et al. [17] proposed an unsupervised clustering method for the potential embedding of data features to overcome missing labels caused by sparse fault samples in industrial situations. Akram et al. [8] transferred learning from the EL dataset and increased the average accuracy rate from 98.67% to 99.23%. Su [16] embedded the channel and spatial position attention mechanism in Faster RCNN's RPN (Region Proposal Network) to extract more refined defect region proposals. On EL images, Rahman et al. [12] defined a hybrid loss function (dice loss combined with focal loss) to train a multi-attention UNet network to help overcome the problem of data imbalance. Although these methods have used CNNs to advance the field to some extent, they have not fully considered the characteristics of photovoltaic images taken by drones, namely, scale and rotation ambiguity. To resolve this issue, we propose an edge-based framework for complete localization and classification.

Materials and Methods
This section presents our Deep Edge-Based Fault Detection (DEBFD) method for solar panels, including edge detection, contour filter, and classification. For the captured infrared images, the pseudocolor mode is fulgurite.

Edge Detection
Our edge detection network, called SEPAN, uses a backbone modified from VGG16 [38] and a neck with a squeeze-and-excitation path aggregation structure. As shown in Figure 2, the input of SEPAN is an infrared image, and its output is an edge map, which represents the probability that each pixel belongs to an edge.

Backbone
As in [31,34], we pre-train VGG16 on ImageNet as our backbone. VGG16 has 5 stages and 13 convolutional layers, and each stage is followed by a pooling layer. From stage 1 to stage 5, the receptive field keeps increasing and contains more and more contextual information. Unfortunately, the receptive field increase is still limited for our task. The experimental results of HED [31] show that the side-output layer 5 (connected with stage 5) produces relatively low performance. Inspired by this observation, we keep only the first three pooling layers and merge stages 4 and 5. As in RCF [34], each convolutional layer of a stage is connected to a 1 × 1 convolutional layer with a channel depth of 21. Then the results within each stage are added element-wise to produce hybrid features.

Squeeze-and-Excitation Path Aggregation Structure
Existing works [39][40][41][42] have obtained high-level semantic features by fusing multi-scale information in object detection and semantic segmentation tasks. Our network design aims to build effective multi-scale edge representations. Top-down and bottom-up structures are two common ways of semantic fusion. Unlike the top-down structure, the bottom-up structure establishes a clean lateral connection path from low levels to high levels. Edge detection can be regarded as a special kind of semantic segmentation, and bottom-up path enhancement is critical for positioning in semantic segmentation. It helps high-level features access low-level positioning information. For more experimental results, please refer to Section 4.3. We also add channel attention to the bottom-up structure in the form of the Squeeze-and-Excitation Block (SE Block), which was first proposed in [43]. Specifically, we replace each lateral vanilla convolution with an SE Block. As shown by the dotted line in Figure 2, the SE Block generates a set of self-attention weights, one per channel, through the average pooling, ReLU, and fully connected layers, and multiplies them element-wise with the original input feature to generate the result. With the SE Block, the bottom-up structure can better model the channel dependency and access global information. Therefore, it can better recalibrate the filter outputs, which leads to a good edge representation. Our framework accepts multi-scale features {C_1, C_2, C_3, C_4} from the backbone and outputs enhanced features {P_1, P_2, P_3, P_4}, where each P_i has the same spatial size as C_i. From C_1 to C_4, the spatial size is gradually reduced by a factor of 2.
Each output layer combines a higher-resolution feature map P_i and a lower-resolution map C_{i+1} through element-wise addition, generating a new feature map P_{i+1}. C_{i+1} goes through an SE Block, and P_i is adjusted to the same spatial size as C_{i+1} through bilinear interpolation. Note that P_1 is directly generated by C_1. In this structure, an SE Block does not change the number of channels in the feature map.
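The aggregation rule described above can be written compactly. The following is a sketch under two assumptions that the text does not state explicitly: P_1 is taken to be C_1 unchanged, and the SE Block is written in its standard form from [43] (global average pooling, two fully connected layers with a ReLU in between, and a sigmoid σ). The symbols Resize, GAP, W_1, and W_2 are notation introduced here for illustration.

```latex
\begin{aligned}
P_1 &= C_1,\\
P_{i+1} &= \mathrm{SE}(C_{i+1}) + \mathrm{Resize}(P_i), \qquad i = 1, 2, 3,\\
\mathrm{SE}(x) &= x \odot \sigma\bigl(W_2\,\mathrm{ReLU}(W_1\,\mathrm{GAP}(x))\bigr).
\end{aligned}
```

Here Resize denotes the bilinear interpolation that brings P_i to the spatial size of C_{i+1}, and ⊙ is channel-wise multiplication, so the channel count is unchanged.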

Loss Function
After obtaining {P_1, P_2, P_3, P_4}, each of them is reduced to a single-channel map by a 1 × 1 convolutional layer, which is then interpolated to the original size and passed through a Sigmoid function to generate the output edge maps {O_1, O_2, O_3, O_4}. We use a weighted cross-entropy loss, where X denotes the training image, and Y = {y_j, j = 1, ..., |Y|} is the corresponding ground truth pixel set. Y+ and Y− denote the sets of edge pixels and non-edge pixels in the ground truth, respectively. p_j is the predicted value of y_j. |Y+| and |Y−| denote the numbers of edge pixels and non-edge pixels in the ground truth, respectively.
The hyper-parameter λ is designed to balance the importance of edges and non-edges, and is set to λ = 1.1 in our experiments. In particular, we concatenate the maps of all scales to generate a fusion result O_fuse, which is used for evaluation. During training, {O_1, O_2, O_3, O_4} and O_fuse are all involved in the loss function calculation, so the total loss function combines the losses of all these outputs.
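For reference, a weighted cross-entropy consistent with these definitions can be written as follows. This is only a sketch that assumes the class-balanced form used by RCF [34], on which the backbone design is based. The placement of λ on the non-edge term, the weights α and β, and the unweighted sum over the side outputs and the fused output are assumptions, not the authors' confirmed definitions.

```latex
\begin{aligned}
\ell(O; X) &= -\alpha \sum_{j \in Y^-} \log(1 - p_j) \;-\; \beta \sum_{j \in Y^+} \log p_j,\\
\alpha &= \lambda \cdot \frac{|Y^+|}{|Y^+| + |Y^-|}, \qquad
\beta = \frac{|Y^-|}{|Y^+| + |Y^-|},\\
L(X) &= \sum_{i=1}^{4} \ell(O_i; X) + \ell(O_{fuse}; X).
\end{aligned}
```

Because edge pixels are rare, β is close to 1 and α is small, so the few edge pixels are not swamped by the many non-edge pixels.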

Contour Filter
According to [9][10][11], an effective filtering mechanism can select the correct contour from numerous candidates. Having obtained the binary mask map in Section 2.1, the contour borders are computed by the algorithm in [44]. Then the minimal enclosing parallelogram of each contour is determined to represent a candidate. We define C as the set of N candidate contours, and a general filtering algorithm can be described as f(C), which is a subset of C. The workflow of the proposed filtering strategy can be divided into four stages: minimal enclosing parallelogram, coarse filter, main direction filter, and RANSAC filter. Figure 3 shows parallelograms filtered by our proposed filtering strategies. The specific details are described below.

Minimal Enclosing Parallelogram
As the flight attitude of the UAV is constantly changing during the inspection, the transformation of the solar panel to the imaging plane can be regarded as an affine transformation; i.e., the rectangular solar panel may be mapped into a parallelogram in the infrared image. Given the contour points, we convert each contour point set into the clockwise vertices of a convex polygon C. Let e and v represent an edge and a vertex of C, respectively. The pair (e, v) is called an antipodal pair of C if v has the farthest Euclidean distance from e among all vertices. For any convex polygon C, we denote P_c as the minimal enclosing parallelogram among all parallelogram candidates. According to [45], each axis of P_c contains an edge of C. One example is shown in Figure 4. Therefore, the minimal enclosing parallelogram for the clockwise convex polygon C can be calculated as follows:
1. Obtain the list L = {(e_i, v_i); i = 1, ..., N} by the Rotating Calipers algorithm, where vertex v_i is the farthest from edge e_i among all the vertices of C;
2. Sequentially traverse the list L and select (e_j, v_j) and (e_k, v_k) to determine the unique parallelogram (j ∈ (1, N), k ∈ (j + 1, N));
3. Repeat Step 2 until all candidate parallelograms have been processed.
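The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it finds each edge's farthest vertex by brute force (O(n^2)) rather than with Rotating Calipers, it returns only the minimal area rather than the parallelogram's vertices, and the function names are hypothetical. Each antipodal pair defines a strip of width h that contains the polygon, and two non-parallel strips intersect in a parallelogram of area h_1 h_2 / sin θ, where θ is the angle between the two edges.

```python
import math
from itertools import combinations


def antipodal_pairs(poly):
    """For each edge of a convex polygon, return (edge_direction, height).

    height is the distance from the edge's supporting line to its farthest
    vertex. Brute force stands in for Rotating Calipers here.
    """
    pairs = []
    n = len(poly)
    for i in range(n):
        (ax, ay), (bx, by) = poly[i], poly[(i + 1) % n]
        dx, dy = bx - ax, by - ay
        length = math.hypot(dx, dy)
        # Perpendicular distance of every vertex to the line through the edge.
        height = max(abs(dx * (py - ay) - dy * (px - ax)) / length
                     for px, py in poly)
        pairs.append(((dx, dy), height))
    return pairs


def min_enclosing_parallelogram_area(poly):
    """Smallest area over all parallelograms built from two antipodal pairs."""
    best = math.inf
    for (d1, h1), (d2, h2) in combinations(antipodal_pairs(poly), 2):
        cross = d1[0] * d2[1] - d1[1] * d2[0]
        sin_t = abs(cross) / (math.hypot(*d1) * math.hypot(*d2))
        if sin_t < 1e-9:  # parallel edges do not bound a parallelogram
            continue
        best = min(best, h1 * h2 / sin_t)
    return best
```

For a contour that is already a parallelogram, the result equals its own area, which is a convenient sanity check.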

Coarse Filter
We use the coarse filter to remove parallelogram candidates that cannot plausibly be solar panels. Define the list P = {p_i; i = 1, ..., N}, which contains the minimal enclosing parallelograms of all contours. Considering a priori knowledge about the shape and location of solar panels, the coarse filtering approach consists of the following operations:
1. Filter those parallelograms, e.g., No. 1 in Figure 3, that do not satisfy t_area^min < A_{p_i} < t_area^max, where A_{p_i} is the area of p_i, and t_area^min and t_area^max are preset thresholds;
2. Exclude parallelograms, e.g., No. 2 in Figure 3, whose ratio r_{p_i} = w_{p_i}/h_{p_i} is larger than a threshold t_ratio, where h_{p_i} and w_{p_i} represent the short side and the long side of p_i, respectively;
3. Filter those parallelograms, e.g., No. 3 in Figure 3, without candidates nearby, i.e., those whose distance to the nearest other parallelogram is greater than a threshold t_d.
Since coarse filtering should not remove any correct candidate, the above thresholds are set conservatively. t_area^min and t_area^max are 500 and 30,000, respectively, while the usual area is around 3000; t_ratio is set to 3.0, while the aspect ratio of solar panels is around 1.5; and t_d is set to 1.5 h_{p_i}.
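The three coarse-filter rules can be sketched as follows, using the threshold values quoted above. Several details are assumptions made for illustration: the `Candidate` record and function name are hypothetical, the distance between parallelograms is measured between centroids, and the isolation test is applied only among candidates that already passed the area and ratio tests.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    cx: float    # centroid x (pixels)
    cy: float    # centroid y (pixels)
    w: float     # long side
    h: float     # short side
    area: float  # parallelogram area (pixels^2)


def coarse_filter(cands, t_min_area=500.0, t_max_area=30000.0,
                  t_ratio=3.0, d_factor=1.5):
    # Rules 1 and 2: plausible area and aspect ratio.
    shaped = [p for p in cands
              if t_min_area < p.area < t_max_area and p.w / p.h <= t_ratio]
    kept = []
    for p in shaped:
        # Rule 3: centroid distance to the nearest other surviving candidate.
        nearest = min((math.hypot(p.cx - q.cx, p.cy - q.cy)
                       for q in shaped if q is not p), default=math.inf)
        if nearest <= d_factor * p.h:  # t_d = 1.5 * h, per candidate
            kept.append(p)
    return kept
```

Because t_d scales with each candidate's short side, the isolation test adapts to the apparent panel size, which changes with flight altitude.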

Main Direction Filter
In this study, the orientation θ_{p_i} of a parallelogram is defined as the direction of its long side, where θ_{p_i} ∈ [0, π). The underlying idea of main direction filtering is that the orientation θ_{p_i} of a true solar panel falls within the range (θ_m − t_θ, θ_m + t_θ), where θ_m represents the main direction of the whole image and t_θ is the allowed variation. According to the distribution pattern of solar panels, we find that they are oriented in approximately the same direction in most cases. Based on this finding, we divide the angle range equally into adjacent grids. Then each parallelogram votes on these grids. Finally, the grid with the highest score decides the main direction. The proposed algorithm is summarized as follows:
1. Divide the range [0, π) into N_o equal grids;
2. For each parallelogram p_i, update the scores of the grid G_i, which p_i belongs to, and its neighboring grids, where t_o is the threshold of the voting strategy;
3. The direction of the highest scoring grid is taken as the main direction θ_m. Filter parallelograms that do not satisfy the condition |θ_{p_i} − θ_m| < t_θ.
Figure 5 shows an example with N_o = 10 and t_o = 1. In the extreme case where more than one contiguous grid has the highest score, we choose the middle angle of the contiguous range as the main direction.
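The voting procedure can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: each parallelogram adds one vote to its own grid and to the t_o grids on either side, the grids wrap around because orientations are periodic with period π, the angular difference in the final test is also taken modulo π, and the value of t_θ (not given in the text) is left as a parameter. Function names are hypothetical.

```python
import math


def main_direction(thetas, n_grids=180, t_o=15):
    """Return the main direction (radians) by grid voting."""
    if not thetas:
        raise ValueError("no orientations to vote with")
    width = math.pi / n_grids
    scores = [0] * n_grids
    for th in thetas:
        g = int((th % math.pi) / width) % n_grids
        for k in range(-t_o, t_o + 1):  # own grid plus neighbors
            scores[(g + k) % n_grids] += 1
    best = max(scores)
    # Start scanning after a non-maximal grid so a run crossing 0 stays whole.
    start = next((i for i, s in enumerate(scores) if s != best), 0)
    best_start, best_len, run_start, run_len = 0, 0, 0, 0
    for step in range(1, n_grids + 1):
        i = (start + step) % n_grids
        if scores[i] == best:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    # Middle angle of the longest run of top-scoring grids.
    return ((best_start + best_len / 2.0) * width) % math.pi


def direction_filter(thetas, theta_m, t_theta):
    """Keep indices whose orientation lies within t_theta of theta_m."""
    kept = []
    for i, th in enumerate(thetas):
        d = abs(th - theta_m) % math.pi
        if min(d, math.pi - d) < t_theta:
            kept.append(i)
    return kept
```

Voting for neighboring grids smooths the histogram, so small orientation errors among true panels still reinforce a single peak.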

RANSAC Filter
In our study, RANSAC is used to eliminate isolated parallelogram candidates whose centroids hardly lie on a straight line with the others. Define the set of centroids of all parallelograms as S = {s_i, i = 1, ..., N}. Then a simple filter algorithm is given below.

1. Find an optimal straight line l using the standard RANSAC algorithm, which has the maximum number of interior points;
2. Remove the interior points of line l from set S to obtain set S′ = {s_i, i = 1, ..., N and s_i ∉ l};
3. Repeat the above steps until S = ∅. Then filter the optimal lines whose number of interior points is less than the threshold t_r.
It is not practical to apply the above method in realistic scenarios. As shown in Figure 6, randomly selecting two points to construct a straight line can cause a real solar panel to be filtered incorrectly. At the same time, the algorithm relies on the selection of suitable thresholds, and the iterative computation has high time complexity. Taking these factors into account, an improved RANSAC filtering algorithm is constructed as follows:
1. For each centroid s_i, choose two lines l_w and l_h, where l_w and l_h represent the directions of the long and short axes of the parallelogram p_i, respectively;
2. Calculate the number of interior points of the 2N lines, where l_w and l_h have interior point thresholds of 0.3w_{p_i} and 0.3h_{p_i};
3. Select one of the optimal straight lines l. Since the interior points of the lines overlap, it is necessary to remove the interior points of the optimal line from the interior points of the non-optimal lines;
4. Repeat Step 3 until all centroids are assigned to an optimal line, filtering out the optimal lines whose number of interior points is less than the threshold t_r.
The improved RANSAC algorithm only needs to compute the interior points of 2N lines (with the time complexity O(N)) and can adaptively adjust the distance threshold for the interior points.
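A minimal sketch of the improved filter follows. It is an illustration under assumptions: each panel is given as a hypothetical tuple (cx, cy, theta, w, h) with theta the long-side direction, interior points are found by perpendicular distance to each line, the greedy selection picks the line with the most remaining interior points, and the default t_r = 3 is a placeholder because the text does not give its value.

```python
import math


def improved_ransac_filter(panels, t_r=3):
    """Return sorted indices of panels lying on a well-supported line."""
    lines = []
    for cx, cy, theta, w, h in panels:
        # l_w along the long axis (tolerance 0.3w), l_h along the short axis
        # (tolerance 0.3h), as stated in Step 2.
        for ang, tol in ((theta, 0.3 * w), (theta + math.pi / 2, 0.3 * h)):
            dx, dy = math.cos(ang), math.sin(ang)
            inliers = {j for j, (x, y, *_) in enumerate(panels)
                       if abs(dx * (y - cy) - dy * (x - cx)) <= tol}
            lines.append(inliers)

    kept = set()
    unassigned = set(range(len(panels)))
    while unassigned:
        best = max(lines, key=len)   # current optimal line
        if len(best) >= t_r:
            kept |= best             # enough support: keep its panels
        unassigned -= best
        # Remove the chosen interior points from all remaining lines.
        lines = [inl - best for inl in lines]
    return sorted(kept)
```

Every centroid is an interior point of its own two lines, so each pass assigns at least one centroid and the loop terminates.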

Classification
After filtering, we apply a perspective transformation to map each parallelogram candidate to a standard rectangular image of fixed size and then use the YOLOv5 model to classify it. As different faults in the infrared images differ only in shape, we treat all hotspots as the same category. In other words, both dotted hotspots and rectangular hotspots are labeled as the same category, and they are distinguished by the bounding boxes of the detection results. YOLOv5 is a family of object detection architectures, including YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x, which trade off speed and accuracy. Because real-time performance is important, we choose YOLOv5s, which uses CSPDarknet53 with an SPP layer as the backbone, PANet as the neck, and a YOLO detection head. We modify the weight ratio of classification loss to regression loss to 3:1, manually adjust the anchor box size, and use the official settings of YOLOv5s for the rest. For an image I of size w × h, YOLOv5 outputs the bounding boxes B = {b_i = (x_i, y_i, w_i, h_i), i = 1, ..., N}. To reduce the false alarm rate of small target hotspots, those bounding boxes that satisfy the condition w_i < w/m or h_i < h/n are ignored. The above condition is determined by the size of solar panels (consisting of 10 × 6 battery cells). We set m = 20 and n = 12; i.e., we ignore hotspots smaller than half a cell. Each image I is then assigned to one of three categories: normal, dotted faulty, or rectangular faulty.
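The small-box rule can be illustrated with a short sketch. The function name and the tuple layout (x, y, w, h) are assumptions for illustration. With m = 20 and n = 12, a 128 × 64 crop gives cut-offs of 6.4 pixels in width and about 5.3 pixels in height, i.e., half of one cell in a 10 × 6 panel.

```python
def drop_small_boxes(boxes, img_w, img_h, m=20, n=12):
    """Discard boxes narrower than img_w/m or shorter than img_h/n."""
    return [b for b in boxes
            if not (b[2] < img_w / m or b[3] < img_h / n)]


# Boxes are (x, y, w, h) in pixels on a 128 x 64 crop.
detections = [(10, 10, 5, 8), (10, 10, 12, 4), (30, 20, 12, 8)]
kept = drop_small_boxes(detections, img_w=128, img_h=64)
```

Only the third detection survives here: the first is too narrow and the second is too short.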

Datasets and Implementation
We collect a dataset of infrared solar panel images taken by a DJI Manifold 2 UAV, including a training set (1200 images) and a test set (860 images). These images are carefully annotated with the Computer Vision Annotation Tool (CVAT, https://cvat.org). We use perspective transformation to crop out training images with a size of 128 × 64 pixels and manually select 15,100 of them for training YOLOv5. The experimental settings of key components are presented below.
For edge detection, SEPAN is trained for nine epochs using a stochastic gradient descent (SGD) optimizer. Data augmentation consisting of random flip, random rotation, and random gamma is applied to the input image (640 × 512 pixels). We set the batch size to 8, and the learning rate is set to 1 × 10^−6 and divided by ten after every three epochs. The weight decay is 0.0002, and the momentum is set to 0.9. The backbone is initialized with the pre-trained VGG16 [38] model, and other weights are initialized from a zero-mean Gaussian distribution with a standard deviation of 0.01. The training of all experiments is based on the PyTorch library [46] and is carried out on a GeForce RTX 2080 Ti GPU. Since the filtering process does not require training, we only set the relevant thresholds, N_o = 180 and t_o = 15°, for the main direction filtering.
We use an SGD optimizer for the training of YOLOv5 and use 0.01 as the initial learning rate with the cosine schedule. Here, the training epochs are set to 300, the batch size is 64, the mini-batch size is 16, the IoU threshold is set to 0.20, and the momentum and weight decay are 0.937 and 0.0005, respectively. We keep the same mosaic augmentation as [23,24] for the 15,100 cropped images. Due to severe class imbalance, we record recall, precision, and macro F1 score instead of accuracy.
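To make the choice of metric concrete, the sketch below computes per-class precision, recall, and F1 and averages the F1 scores with equal weight per class, which is the usual definition of macro F1. The function name and the toy labels are illustrative only and are not data from our experiments.

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)


# Toy example: four normal panels, one dotted fault, one rectangular fault.
truth = ["normal", "normal", "normal", "normal", "dotted", "rect"]
pred = ["normal", "normal", "normal", "dotted", "dotted", "rect"]
score = macro_f1(truth, pred, ["normal", "dotted", "rect"])
```

Because every class contributes equally to the average, a rare fault class that is poorly detected lowers macro F1 far more than it lowers overall accuracy.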
In addition, we also evaluate the edge detection method on BSDS500. There are 200, 100, and 200 images in the BSDS500 training set, validation set, and test set, respectively. We use the same data augmentation as in [31,32,34,37], including flipping, scaling, and rotation, to enlarge the training set 96 times. At the same time, we also add an additional flipped PASCAL VOC Context dataset to the training set. The model settings and operating environment are the same as for the dataset of solar panel images. F-measures at both the Optimal Dataset Scale (ODS) and the Optimal Image Scale (OIS) are recorded for evaluation.

Results on Solar Panel Image Dataset
We evaluate the performance of our method on different types of PV plants. The final detection results are shown in Figure 7. Note that both dotted faulty and rectangular faulty solar panels count as faulty panels, which should be handled. Dotted faulty and rectangular faulty solar panels may be confused with each other, but that mistake does not matter from a practical perspective. Therefore, dotted faulty and rectangular faulty solar panels can be combined into faulty ones, which are also called macro ones. It can be seen that, except for the PV plant on water, all scenes have strong background interference. We show comparison results in Table 1. Our method achieves a macro F1 score of 0.9444 on all test data, which demonstrates the strategy's effectiveness. We can observe the following phenomena: (1) The normal class exceeds 99% on all indicators, owing to class imbalance. Dotted faults are more difficult to detect than rectangular ones, as they usually cover a smaller fault area. (2) Considering the complexity of the background area of PV plants, our method achieves better performance on the roof and the water. (3) It is crucial to find faults in practice. Therefore, our method favors a high recall rate at the cost of a slightly higher false alarm rate.
To demonstrate the effectiveness of the contour filter strategy, we conduct ablation studies on the dataset of solar panels. The purpose of filtering is to remove irrelevant background contours after edge detection. Therefore, the accuracy directly reflects the effectiveness of filtering. As shown in Table 2, we have explored the impact of the minimal enclosing parallelogram, coarse filtering, main direction filtering, and RANSAC filtering on accuracy. Note that T/F represents the numbers of contours filtered correctly and incorrectly, respectively. We found that the coarse filter removes the most background contours, increasing the accuracy by about 1.5%. The main direction filtering and RANSAC filtering are designed to identify the background contours whose shape is similar to that of solar panels. Thanks to the more accurate shape representation, the minimal enclosing parallelogram also improves the accuracy by about 0.5% compared with the rotated rectangle.

Results on BSDS500
We compare our method with other edge detection approaches, including both traditional and learning-based ones, in Table 3 and Figure 8. The results show that ours reaches an ODS of 0.809, which outperforms most of the deep learning-based edge detectors and confirms the effectiveness of our bottom-up self-attention structure. When the baseline does not use the SEPAN structure, an ODS of 0.805 is achieved, which still exceeds most methods, including HED [31], CED [32], and AMH-Net [33]. Note that the SEPAN structure has almost no effect on the inference speed, but increases the accuracy by about 0.4%. In addition, we also provide a lightweight model, SEPAN-tiny, which only replaces the convolutions of the backbone network with separable convolution and achieves 128 fps. Figure 8 shows a visual comparison among several current edge detectors and SEPAN on the BSDS500 test set.

Table 3 (excerpt; columns: method, ODS, OIS, FPS).
Method          ODS     OIS     FPS
CED [32]        0.794   0.811   -
AMH-Net [33]    0.798   0.829   -
RCF [34]        0.806   0.823   30 †
LPCB [35]       0.808   0.824   30 †
BDCN [36]       0.820   0.838   47 ‡
PiDiNet [37]    0...

The results of the top-down and bottom-up structural ablation experiments are shown in Table 4. In this experiment, we use the 200 images of the BSDS500 training set and the VOC dataset for training and record the metrics on the BSDS500 validation set. As expected, the bottom-up structure obtains more precise positioning in edge detection than the top-down one and achieves better performance. We also add an SE Block as a comparison, which produces better results. Interestingly, the SE Block brings a 0.2% improvement for the bottom-up structure, but almost nothing for the top-down one. We believe that the bottom-up structure is more suitable for the multi-scale fusion of edge detection, which is verified by the experimental results.

Conclusions
This paper presents a deep edge-based application for fault detection of solar panels. Our method, DEBFD, takes infrared images of solar panels as input and detects dotted and rectangular faults. DEBFD consists of three parts (edge detection, contour filter, and classification), which are fulfilled by the deep learning network SEPAN, a machine learning contour filter, and the deep learning network YOLOv5, respectively. DEBFD achieved a high macro F1 score on the 860 images taken from real scenarios. It is worth mentioning that SEPAN is our proposed bottom-up self-attention network and can effectively and accurately locate edges. Moreover, experiments on public datasets also confirm the effectiveness of our method. In summary, DEBFD shows promising results in multiple real environments and is expected to find broad applications in automatic PV fault detection. Note that DEBFD may confuse dotted faults with rectangular faults of solar panels. In the future, we will explore more distinctive features to distinguish dotted and rectangular faults and facilitate automatic fault classification.

Figure 1. The pipeline of DEBFD includes three steps: edge detection (SEPAN), contour filter, and classification (YOLOv5). The task of DEBFD is to locate and identify normal, dotted faulty, and rectangular faulty solar panels in infrared images.

Figure 3. An example of filtering policies. Note that different numbered parallelograms represent different types of filtered background contours.

Figure 4. An example of a minimal enclosing parallelogram. Note that an antipodal pair includes a red edge and a blue vertex.

Figure 5. An example of the main direction filtering (N_o = 10, t_o = 1). Green and purple represent the voted and unvoted grids, respectively. A table of the voting results is also shown at the bottom.

Figure 6. An example of the RANSAC filter when the red line is incorrectly filtered: (a) the standard RANSAC algorithm; (b) the improved RANSAC algorithm.

Table 1. Indicators of three types of solar panels in different scenarios (desert, mountain, roof, water, and woodland). Sum represents images in all scenarios. We record the precision (P), recall (R), and F1 score. Note that the best ones are colored in red.

Table 2. Ablation studies on the minimal enclosing parallelogram, coarse, main direction (MD), and RANSAC filters. T/F represents the numbers of correctly and incorrectly excluded contours, respectively, and we record the accuracy of the final results.

Figure 7. Some test results under different environments. Normal, dotted faulty, and rectangular faulty solar panels are marked with green, dark blue, and light blue parallelograms, respectively.

Table 3. Comparison with other methods on the BSDS500 dataset. ‡ indicates the speed of our implementations on an NVIDIA RTX 2080 Ti GPU. † means the cited GPU speed.

Table 4. Ablation on top-down and bottom-up structures. The training data are the BSDS500 training set and the VOC dataset, and the evaluation data are the BSDS500 validation set.